Abstract

We consider content distribution in vehicular ad hoc networks. We assume that a file is
encoded using a fountain code, and the encoded messages are cached at infostations. Vehicles
are allowed to download data packets from infostations, which are placed along a highway.
In addition, two vehicles can exchange packets with each other when they are in proximity.
As long as a vehicle has received enough packets from infostations or from other vehicles,
the original file can be recovered. In this work, we show that system throughput increases
linearly with the number of users, meaning that the system exhibits linear scalability.
Furthermore, we analyze the effect of mobility on system throughput by considering both discrete
and continuous velocity distributions for the vehicles. In both cases, system throughput is
shown to decrease when the average speed of all vehicles increases. In other words, higher
overall mobility reduces system throughput.
1 Introduction



Peer-to-peer file sharing has been a paradigm for content distribution over the Internet for
many years. It differs from the traditional client-server architecture in that after a client starts
downloading a file from one or more servers, it itself becomes a server, able to serve other users'
requests. In other words, each participating node plays the role of both server and client at the
same time. There is no single bottleneck in the system; the capacity grows when more nodes
participate, so that the system scales with the number of participants. The best example is the
highly popular BitTorrent system [3]. It splits the shared file into small blocks. Users download
missing blocks from their peers. Once downloaded, those blocks become available to their peers.

Although many peer-to-peer file sharing systems were developed for wired networks, they
may not be suitable for all kinds of wireless systems. For example, in wireless ad hoc networks,
a packet typically has to traverse multiple hops from source to destination. It was shown in
[6] that per-node throughput scales as $\Theta(1/\sqrt{n \ln n})$, which drops to zero for large $n$.
Consequently, the multi-hop strategy is intrinsically unscalable, no matter what protocols are
used in the network layer and above. To maintain scalability, a two-hop relay strategy was
considered in [5]. It was shown that with node mobility, per-node throughput becomes $\Theta(1)$.
As a result, system capacity scales linearly with the number of nodes. This drastic difference 
motivates the design of many mobility-assisted data transfer protocols. Some examples are 
presented in [13], [14].

In this work, we focus on vehicular ad hoc networks (VANETs), which consist of cars, trucks,
motorcycles, and all sorts of vehicles on the road. A major characteristic of a VANET is its highly
dynamic topology. Nodes are intermittently connected when they encounter one another on
the road. If traffic density is low, the proportion of time that a node is connected to another
node may be small, which may result in large delay. On the other hand, the instantaneous
transmission rate can be very high, especially if transmission proceeds only when two nodes
are close to each other. Due to its nature, a VANET is particularly suitable for delay-tolerant
applications with large bandwidth requirements. An example is a content provider that allows
its subscribers to download movies, music, or news from an infostation at the roadside when
they pass by, and to exchange contents among themselves when they encounter one another
on the road. A user can simply run an application program in the background, without being
aware of the download schedule. To facilitate the development of these applications, a content
distribution protocol is needed. A BitTorrent-like protocol called CarTorrent was proposed
in [12]. Two other protocols were designed in [2], [8] based on the idea of network coding [1], [9].
In this work, we adopt the fountain code approach [11]. Encoding is performed at infostations
but not at vehicles. This method can reduce processing time at vehicles, and reduce decoding
complexity if a suitable fountain code is used.

The contributions of this work are these: First, the application scenario is modeled, which 
reveals the relationship between coding, delay, and throughput. Second, exact formulae for 
throughput are derived, from which insights on how mobility affects throughput can be gained. 
Our approach is similar to that in [15], but with some major differences in modeling. 
2 Content Distribution for Vehicular Network



We consider a one-dimensional vehicular network, which models the scenario where many
cars are running on a highway. Suppose that a portion of the car users subscribe to a content
distribution network. They are interested in downloading a common file from the content
provider. The file is split into $K$ smaller blocks $W_1, W_2, \ldots, W_K$, each of which consists of $L$
bits. These message blocks are cached in infostations [4], placed along the highway. When
a car comes close to an infostation, it can download message blocks from it. In addition, a car
can exchange message blocks with another car in proximity. We refer to a car or an infostation
as a node, and say that a node encounter occurs when two nodes come within a
transmit range $r$ of each other. Data exchange between the two nodes then begins. The
amount of data exchanged depends on the transmission bit rate $R_b$ and the connection time. We
assume that non-adaptive radio is used so that $R_b$ is constant throughout the encounter period.
We also assume channel coding is used so that the probability of decoding error is negligible.

We adopt a fountain code approach [11] for file distribution at infostations. When a car
is within the transmit range of an infostation, the infostation generates and transmits some
encoded messages to the car. Each encoded message is obtained by linearly combining the
original message blocks:

$$\sum_{k=1}^{K} c_k W_k, \tag{1}$$

where each $c_k$ is either 0 or 1, and the addition is performed over $\mathbb{F}_2$. The vector
$\mathbf{c} = (c_1, c_2, \ldots, c_K)$ is called the encoding vector, which is generated randomly. There are
various ways to generate it. One simple way is to pick a vector uniformly at random from $\mathbb{F}_2^K$.
Another way is to generate it according to the robust soliton distribution in LT codes [10].
Each packet consists of an encoded message as in (1) and the corresponding encoding vector.
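
As an illustration of (1), the following Python sketch forms one packet by XOR-ing a randomly selected subset of message blocks. It assumes the uniform choice of encoding vector described above. The names `combine` and `encode_packet`, and the representation of blocks as byte strings, are hypothetical and chosen only for this example.

```python
import random

def combine(c, blocks):
    """XOR together the blocks W_k whose coefficient c_k is 1 (addition over F_2)."""
    encoded = bytearray(len(blocks[0]))
    for c_k, block in zip(c, blocks):
        if c_k:
            for j, byte in enumerate(block):
                encoded[j] ^= byte
    return bytes(encoded)

def encode_packet(blocks, rng):
    """Return one packet as (encoding vector, encoded message).

    Assumption: the encoding vector is uniform over F_2^K; an LT code would
    instead draw it from the robust soliton distribution.
    """
    c = [rng.randint(0, 1) for _ in blocks]
    return c, combine(c, blocks)

# Three 2-byte message blocks (illustrative values only).
blocks = [bytes([1, 2]), bytes([4, 8]), bytes([16, 32])]
vector, message = encode_packet(blocks, random.Random(0))
```

With the encoding vector (1, 0, 1), `combine` returns the bytes (17, 34), which is the bitwise XOR of the first and third blocks.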

The protocol that we propose for packet exchange follows a two-hop strategy. When two 
cars are within the transmit range of each other, they will exchange those packets that are 
directly downloaded from infostations. Those packets that are received from other cars will 
not be forwarded again. In other words, each packet is transmitted in at most two hops: from 
an infostation to a car, and from that car to another car. 
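
The two-hop rule can be sketched in Python as follows. This is only a minimal model under stated assumptions: the class `Car`, its two buffers, and the per-encounter packet `budget` are hypothetical names introduced for the example, and packet contents are plain strings.

```python
from dataclasses import dataclass, field

@dataclass
class Car:
    name: str
    direct: list = field(default_factory=list)   # packets downloaded from infostations
    relayed: list = field(default_factory=list)  # packets received from other cars

    def download(self, packets):
        """First hop: store packets obtained directly from an infostation."""
        self.direct.extend(packets)

def exchange(a, b, budget):
    """Second hop: each car sends at most `budget` of its directly downloaded packets.

    Packets in the `relayed` buffers are never forwarded again.
    """
    to_b = a.direct[:budget]
    to_a = b.direct[:budget]
    b.relayed.extend(to_b)
    a.relayed.extend(to_a)

car_a, car_b, car_c = Car("a"), Car("b"), Car("c")
car_a.download(["a1", "a2"])
car_b.download(["b1"])
exchange(car_a, car_b, budget=1)   # b now holds "a1" as a relayed packet
exchange(car_b, car_c, budget=5)   # c receives only "b1"; "a1" is not forwarded
```

Keeping the two buffers separate is one simple way to guarantee that no packet travels more than two hops.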

A vehicle can recover the original file if the encoding vectors in the received packets span
the vector space $\mathbb{F}_2^K$, which happens when $K$ linearly independent encoding vectors have been
received. Indeed, if $\mathbf{c}_1, \mathbf{c}_2, \ldots, \mathbf{c}_K$ are linearly independent encoding vectors, the file
can be decoded by inverting the $K \times K$ matrix whose $i$-th row is $\mathbf{c}_i$ for $i = 1, 2, \ldots, K$.
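
The decoding step can be illustrated with Gauss-Jordan elimination over $\mathbb{F}_2$, which is equivalent to inverting the matrix of encoding vectors. The sketch below is not tuned for efficiency, and the function name `decode` and the packet representation as (vector, bytes) pairs are assumptions made for this example.

```python
def decode(packets, K):
    """Recover the K message blocks by Gauss-Jordan elimination over F_2.

    packets: list of (encoding_vector, encoded_message) pairs.
    Returns the list of blocks, or None if the vectors do not span F_2^K.
    """
    rows = [(list(c), bytearray(m)) for c, m in packets]
    for col in range(K):
        # Find a row at or below position `col` with a 1 in this column.
        pivot = next((i for i in range(col, len(rows)) if rows[i][0][col]), None)
        if pivot is None:
            return None  # fewer than K linearly independent encoding vectors
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for i in range(len(rows)):
            if i != col and rows[i][0][col]:
                # Row addition over F_2 is XOR on both the vector and the message.
                rows[i] = (
                    [x ^ y for x, y in zip(rows[i][0], rows[col][0])],
                    bytearray(x ^ y for x, y in zip(rows[i][1], rows[col][1])),
                )
    return [bytes(rows[k][1]) for k in range(K)]

# Packets built from blocks (1,2), (4,8), (16,32) with vectors 110, 011, 111.
packets = [
    ([1, 1, 0], bytes([5, 10])),
    ([0, 1, 1], bytes([20, 40])),
    ([1, 1, 1], bytes([21, 42])),
]
recovered = decode(packets, 3)
```

Here `recovered` holds the three original blocks, while passing only the first two packets returns `None` because two vectors cannot span $\mathbb{F}_2^3$.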

3 Throughput Analysis 

We assume that cars arrive at the highway according to a Poisson process with rate $\lambda$. Each of them
travels on the highway at constant velocity. Those coming from the left have positive velocity and
are collectively called the forward traffic. Those coming from the right have negative velocity
and are called the reverse traffic.

On the highway, two nodes are connected if their distance is less than or equal to the transmit
range, denoted by $r$. The connection time between two cars of velocities $v$ and $v'$, denoted
by $T_c$, depends on their relative speed and is given by

$$T_c = \frac{r}{|v - v'|}. \tag{2}$$

Note that the difference of velocity $v - v'$ may be negative, and the sign depends on their
directions. The maximum number of packets that can be exchanged during an encounter is
$R_p T_c$, where $R_p$ is the transmission rate in packets per second and is equal to $R_b$ divided by
the packet size in bits. Likewise, a car of velocity $v$ can download $R_p r/|v|$ packets from an
infostation in one encounter.

We assume that there is an infostation at every entrance of the highway. When a vehicle
enters the highway system, it collects some encoded message blocks from the infostation. As
cars usually enter the highway at low speed, they should have picked up enough packets to
be exchanged during any future encounter. When two nodes of velocities $v$ and $v'$ meet each
other, we assume that the number of packets transmitted in each direction is $R_p r/(2|v - v'|)$.
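
To give a feel for these quantities, the following Python sketch evaluates the two packet counts stated above. The numerical values (packet rate, range, and speeds) are assumptions chosen for round results, and the function names are hypothetical.

```python
def packets_from_infostation(R_p, r, v):
    """Packets a car of velocity v downloads in one infostation encounter."""
    return R_p * r / abs(v)

def packets_per_direction(R_p, r, v, v_other):
    """Packets sent in each direction when cars of velocities v and v_other meet."""
    return R_p * r / (2 * abs(v - v_other))

# Illustrative numbers (assumptions): 100 packets/s, 100 m range, speeds in m/s.
print(packets_from_infostation(100, 100, 25))    # 400.0
print(packets_per_direction(100, 100, 30, -20))  # 100.0  (opposite directions)
print(packets_per_direction(100, 100, 30, 25))   # 1000.0 (same direction)
```

The last two lines show that an encounter between cars moving in the same direction lasts longer and therefore carries more packets than one between cars moving in opposite directions.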

Since the velocity of each node is assumed constant, two nodes meet each other at most once
as they travel along the highway. We can therefore guarantee that any packet newly received
by a car is statistically independent of the packets already stored in its buffer. Consequently,
the packets received by a vehicle are all statistically independent.

The number of packets that must be received before $K$ linearly independent encoding
vectors are obtained depends on the probability distribution of encoding vectors. Based on
the assumption that the received encoding vectors are statistically independent, the following
results apply: If the distribution of encoding vectors is uniform, the original file can be decoded
with probability $1 - \epsilon$, for some small constant $\epsilon$, after $K + \log_2(1/\epsilon)$ packets are received. If we
use an LT code with the robust soliton distribution, the number of packets needed is $K + 2S \log_2(S/\epsilon)$,
where $S = c\sqrt{K} \log_e(K/\delta)$ and $c$ is a parameter of order 1 [11]. Given the probability of
decoding failure $\epsilon$, the downloading time is obtained by dividing the required number of packets
by the packet rate. Our objective is to estimate the average downloading time of the file in
a VANET by analyzing the packet rate, which we call throughput in the sequel. We will first
consider the case where the velocity distribution is discrete, and then extend the results to the
continuous case at the end of this section.
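
For the uniform case, the relation between the failure probability, the number of packets, and the downloading time can be sketched as follows. The function names and the numerical values are assumptions for illustration, and the packet count is rounded up to an integer.

```python
import math

def packets_needed_uniform(K, eps):
    """Packets needed to decode with probability 1 - eps (uniform encoding vectors)."""
    return math.ceil(K + math.log2(1 / eps))

def download_time(num_packets, packet_rate):
    """Downloading time: required number of packets divided by the packet rate."""
    return num_packets / packet_rate

# Hypothetical file of K = 1000 blocks, failure probability 2**-10, 5 packets/s on average.
needed = packets_needed_uniform(1000, 2 ** -10)
print(needed)                      # 1010
print(download_time(needed, 5.0))  # 202.0
```

The overhead of 10 extra packets does not depend on $K$, so for a fixed $\epsilon$ the downloading time is essentially $K$ divided by the throughput.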



3.1 Discrete Velocity Distribution 

Suppose that the velocity $V$ of a vehicle can take on values from a finite set, $\{v_1, v_2, \ldots, v_M\}$,
with probabilities $p_1, p_2, \ldots, p_M$ respectively, where $\sum_{m=1}^{M} p_m = 1$. Denote the set $\{1, 2, \ldots, M\}$
by $\mathcal{M}$. This model is applicable to the scenario where the traffic is heavy and nodes in
different lanes travel at different speeds. A node, when entering the highway, can choose a suitable
lane.

We consider a specific node, called the observer node, or simply the observer, traveling
along a segment of highway between two consecutive infostations A and B. We will analyze
the throughput of the observer in this segment of the highway.

Suppose that the observer belongs to class $i$ for some $i \in \mathcal{M}$, and moves at speed $v_i$ in
the forward direction from A to B. Assume that the length of this segment of the highway is $d$.
The traveling time of the observer in this segment is given by $t_i = d/v_i$. We denote by $N_i$
the number of node encounters for the observer when traveling in this segment of highway.
Furthermore, for $k = 1, 2, \ldots, N_i$, we denote by $B_i(k)$ the number of packets received from the
$k$-th encounter. Assuming that the observer does not encounter two other nodes at the same
time, the total number of packets received by the observer in this highway segment is

$$B_i = \frac{R_p r}{v_i} + \sum_{k=1}^{N_i} B_i(k). \tag{3}$$

The first term corresponds to the packets directly downloaded from infostation A, and the second
to the total number of packets received from other vehicles.

In order to find the expected value of $B_i$, we split the Poisson arrival process into $M$
independent Poisson streams with rates $p_m \lambda$, where $m = 1, 2, \ldots, M$. Let $N_{i,m}$ be the number
of encounters of the observer with nodes in class $m$, so that

$$N_i = N_{i,1} + N_{i,2} + \cdots + N_{i,M}. \tag{4}$$

The following lemma gives the expected value of $N_{i,m}$.

Lemma 1. $N_{i,m}$ is Poisson distributed with mean

$$E[N_{i,m}] = \lambda p_m |t_m - t_i|, \tag{5}$$

where $t_m = d/v_m$.

Proof. Without loss of generality, suppose the observer enters the highway segment at time 0
and departs at time $t_i$. We consider its encounters with forward traffic and reverse traffic
separately.

For forward traffic, consider a node of velocity $v_m > 0$, which enters the highway segment
at time $t$ and departs at time $t + t_m$. Suppose the speed of the node is lower than that of the
observer, that is, $t_m > t_i$. It will encounter the observer if and only if it enters the highway
before the observer does (i.e., $t < 0$) and it departs the highway after the observer does (i.e.,
$t + t_m > t_i$). In other words, an encounter occurs if and only if $-(t_m - t_i) < t < 0$. Since
the arrival process is Poisson with rate $\lambda p_m$, the number of encounters is Poisson distributed
with mean $\lambda p_m (t_m - t_i)$. Next suppose $t_m < t_i$. An encounter occurs if and only if the node
enters after time 0 (i.e., $t > 0$) and it departs before $t_i$ (i.e., $t + t_m < t_i$). Again the number of
encounters is Poisson distributed with mean $\lambda p_m |t_m - t_i|$.

For reverse traffic, consider a node of velocity $v_m < 0$. If it enters the highway before time 0
(i.e., $t < 0$), it will encounter the observer if $t + |t_m| > 0$. If it enters the highway after time 0
(i.e., $t > 0$), it will encounter the observer if it enters before $t_i$ (i.e., $t < t_i$). Combining the two
cases, we can see that an encounter occurs if $-|t_m| < t < t_i$. Hence, the number of encounters
is Poisson distributed with mean $\lambda p_m (|t_m| + t_i)$, which is also equal to $\lambda p_m |t_m - t_i|$
because $t_m < 0 < t_i$. □

Let $\mathcal{M}_{-i}$ be the set $\mathcal{M} \setminus \{i\}$. We next obtain an expression for the mean of $B_i$.

Lemma 2.

$$E[B_i] = \frac{R_p r\, t_i}{d} \Big[ 1 + \frac{\lambda}{2} \sum_{m \in \mathcal{M}_{-i}} p_m |t_m| \Big]. \tag{6}$$
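Proof. The observer will only encounter nodes in class $m$ for $m \neq i$. When the observer meets
another node of velocity $v_m$, $v_m \neq v_i$, the number of packets received is equal to $R_p r/(2|v_m - v_i|)$.
We sum over all $m \in \mathcal{M}_{-i}$ and obtain

$$B_i = \frac{R_p r}{v_i} + \sum_{m \in \mathcal{M}_{-i}} \frac{N_{i,m} R_p r}{2|v_m - v_i|}. \tag{7}$$

Taking expectation and using Lemma 1, we have

$$E[B_i] = \frac{R_p r}{v_i} + \sum_{m \in \mathcal{M}_{-i}} \frac{E[N_{i,m}] R_p r}{2|v_m - v_i|} \tag{8}$$

$$= \frac{R_p r}{v_i} + R_p r \sum_{m \in \mathcal{M}_{-i}} \frac{\lambda p_m |t_m - t_i|}{2|v_m - v_i|} \tag{9}$$

$$= \frac{R_p r\, t_i}{d} \Big[ 1 + \frac{\lambda}{2} \sum_{m \in \mathcal{M}_{-i}} p_m |t_m| \Big], \tag{10}$$

where the last step uses $1/v_i = t_i/d$ and $|t_m - t_i| / |v_m - v_i| = d/(|v_m| v_i) = |t_m| t_i / d$.
□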
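
The derivation can be checked numerically. The sketch below evaluates $E[B_i]$ once as the sum over classes in (8), using the mean of Lemma 1, and once in the simplified form obtained from the identity $|t_m - t_i| / |v_m - v_i| = |t_m| t_i / d$. All parameter values and function names are assumptions for illustration.

```python
def expected_encounters(lam, p_m, d, v_i, v_m):
    """Lemma 1: mean number of class-m nodes met, lam * p_m * |t_m - t_i|."""
    return lam * p_m * abs(d / v_m - d / v_i)

def expected_packets_sum(i, lam, p, v, d, R_p, r):
    """E[B_i] as the infostation term plus one term per other class, as in (8)."""
    total = R_p * r / v[i]
    for m in range(len(v)):
        if m != i:
            n_im = expected_encounters(lam, p[m], d, v[i], v[m])
            total += n_im * R_p * r / (2 * abs(v[m] - v[i]))
    return total

def expected_packets_closed(i, lam, p, v, d, R_p, r):
    """Same quantity after simplifying |t_m - t_i| / |v_m - v_i| = |t_m| t_i / d."""
    t_i = d / v[i]
    s = sum(p[m] * abs(d / v[m]) for m in range(len(v)) if m != i)
    return R_p * r * t_i / d * (1 + lam * s / 2)

# Hypothetical scenario: observer in class 0 (forward, 20 m/s); class 2 is reverse traffic.
v = [20.0, 30.0, -25.0]
p = [0.3, 0.3, 0.4]
args = (0, 0.5, p, v, 10000.0, 100.0, 100.0)
by_sum = expected_packets_sum(*args)
by_closed_form = expected_packets_closed(*args)
```

For these values both functions give approximately 33000 packets, of which 500 come from the infostation and the rest from encounters with other cars.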



Define $C_i = B_i/t_i$ as the average throughput of the observer during its traveling time on
the highway segment. Then we have

$$E[C_i] = \frac{R_p r}{d} \Big[ 1 + \frac{\lambda}{2} \sum_{m \in \mathcal{M}_{-i}} p_m |t_m| \Big]. \tag{11}$$


Consider a particular time instant $t$. A car of velocity $v_m$ will be on this highway segment if
it enters this segment within the interval $[t - |t_m|, t]$. Therefore, the number of cars of velocity
$v_m$ that are on the highway segment is Poisson distributed with mean equal to $\lambda p_m |t_m|$. The above
equation can be rewritten in terms of car density as follows:

Theorem 3. Let $\rho_m = \lambda p_m |t_m| / d = \lambda p_m / |v_m|$ be the density of cars of velocity $v_m$. Then

$$E[C_i] = R_p r \Big( \frac{1}{d} + \frac{1}{2} \sum_{m \in \mathcal{M}_{-i}} \rho_m \Big). \tag{12}$$

The first term within the parentheses in Theorem 3 can be regarded as the density of
infostations in the highway segment, and the second term is half the sum of car densities over all
classes except the observer's class. It is interesting to find that the individual throughput
depends only on the density of other nodes. Note that the density of nodes belonging to the
same class is irrelevant because there will not be any intra-class encounter.

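The per-node throughput can also be expressed as

$$E[C_i] = R_p r \Big[ \frac{1}{d} + \frac{1}{2} \Big( \sum_{m=1}^{M} \rho_m - \rho_i \Big) \Big] \tag{13}$$

$$= R_p r \Big[ \frac{1}{d} + \frac{\lambda}{2} \Big( \sum_{m=1}^{M} \frac{p_m}{|v_m|} - \frac{p_i}{|v_i|} \Big) \Big]. \tag{14}$$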

We observe the following: 
• Low-Density Gain: The class of cars that has the lowest density gets the largest average
per-node throughput.

• High-Speed Gain: If the speed distribution is equiprobable, i.e., $p_1 = p_2 = \cdots = p_M$, then
the faster the car, the higher the average throughput it gets.
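
Both observations can be reproduced with a few lines of Python that evaluate Theorem 3 for each class. The speeds, arrival rate, and radio parameters below are assumptions chosen for illustration, and the function names are hypothetical.

```python
def class_densities(lam, p, v):
    """rho_m = lam * p_m / |v_m|, the density of cars in velocity class m."""
    return [lam * p_m / abs(v_m) for p_m, v_m in zip(p, v)]

def class_throughput(i, lam, p, v, d, R_p, r):
    """Theorem 3: E[C_i] = R_p r (1/d + half the density of all other classes)."""
    rho = class_densities(lam, p, v)
    others = sum(rho_m for m, rho_m in enumerate(rho) if m != i)
    return R_p * r * (1 / d + others / 2)

# Equiprobable speeds (hypothetical values, m/s): the fastest class has the lowest density.
v = [20.0, 30.0, 40.0]
p = [1 / 3, 1 / 3, 1 / 3]
rho = class_densities(0.5, p, v)
throughputs = [class_throughput(i, 0.5, p, v, 10000.0, 100.0, 100.0) for i in range(3)]
```

In this example `rho` is decreasing in the class index while `throughputs` is increasing, so the fastest and least dense class obtains the largest average throughput.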

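Now let $C$ be the average per-node throughput. By averaging the per-node throughput in
Theorem 3 over all velocity classes, we have

$$E[C] = \sum_{i=1}^{M} p_i E[C_i] \tag{15}$$

$$= R_p r \Big( \frac{1}{d} + \frac{1}{2} \sum_{i=1}^{M} p_i \sum_{m \in \mathcal{M}_{-i}} \rho_m \Big), \tag{16}$$

which can be rewritten as follows:

Theorem 4.

$$E[C] = R_p r \Big( \frac{1}{d} - \frac{\bar{\rho}}{2} + \frac{1}{2} \sum_{m=1}^{M} \rho_m \Big) \tag{17}$$

$$= R_p r \Big[ \frac{1}{d} + \frac{\lambda}{2} \sum_{i < j} p_i p_j \Big( \frac{1}{|v_i|} + \frac{1}{|v_j|} \Big) \Big], \tag{18}$$

where $\bar{\rho} = \sum_{i=1}^{M} p_i \rho_i$.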

Note that system throughput varies linearly with $C$ and can be obtained by multiplying $C$
by the number of users. Based on the above result, the following facts can be observed:

• Incrementally Linear Scalability: The average per-node throughput increases with the
node arrival rate, $\lambda$, in an incrementally linear fashion.

• Mobility Reduces Throughput: If all cars move faster, then the average per-node throughput
decreases. For example, suppose all cars double their speeds. Then the car density
of each velocity class decreases by one half. According to Theorem 3, the throughput of
all users decreases. Hence the system throughput decreases.
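
These two facts can be illustrated numerically. The sketch below computes the average per-node throughput in two ways, by averaging Theorem 3 over the classes and by a sum over unordered pairs of distinct classes, and then doubles all speeds. The parameter values and function names are assumptions for this example.

```python
def average_throughput(lam, p, v, d, R_p, r):
    """E[C] = sum_i p_i E[C_i], with E[C_i] taken from Theorem 3."""
    M = len(v)
    rho = [lam * p[m] / abs(v[m]) for m in range(M)]
    total = 0.0
    for i in range(M):
        others = sum(rho[m] for m in range(M) if m != i)
        total += p[i] * R_p * r * (1 / d + others / 2)
    return total

def average_throughput_pairwise(lam, p, v, d, R_p, r):
    """Equivalent pairwise form: sum over unordered pairs of distinct classes."""
    M = len(v)
    pair_sum = sum(
        p[i] * p[j] * (1 / abs(v[i]) + 1 / abs(v[j]))
        for i in range(M) for j in range(i + 1, M)
    )
    return R_p * r * (1 / d + lam * pair_sum / 2)

v = [20.0, 30.0, -25.0]
p = [0.3, 0.3, 0.4]
base = average_throughput(0.5, p, v, 10000.0, 100.0, 100.0)
fast = average_throughput(0.5, p, [2 * x for x in v], 10000.0, 100.0, 100.0)
print(fast < base)  # True: doubling every speed halves every class density
```

Evaluating `average_throughput` at several values of the arrival rate also shows the incrementally linear dependence: equal increments of the rate produce equal increments of throughput.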

Although the velocity of the cars cannot be controlled by the system, it is interesting to
know which probability mass function maximizes system throughput, for a given set of velocities
$\{v_1, v_2, \ldots, v_M\}$. We answer this question in the Appendix.



3.2 Continuous Velocity Distribution 

The analysis for discrete velocity can be extended to the case where the velocity distribution 
is continuous. This model, called the wide motorway model in [7], is applicable to the scenario 
where there are multiple lanes and moderate traffic. Since nodes can overtake others at different 
lanes, there is no interaction among the nodes even if they travel in the same direction. A node 
can have any speed the driver likes, subject to the speed limit. 
Suppose that the velocity $V$ is a continuous random variable, whose probability density
function is $f_V(v)$, defined for $v \in [a, b]$. We divide the interval $[a, b]$ into many subintervals, each
of length $\Delta v$. On each subinterval, $f_V$ is approximated by a constant. We assume that $f_V$
is Lipschitz continuous, so that we can approximate $f_V$ as closely as we like by increasing the
number of subintervals. The next theorem is analogous to Theorems 3 and 4.

Theorem 5. Let $C_i$ denote the throughput of a particular observer node with velocity $v_i$, and $C$
the average per-node throughput. Let $N$ be the number of cars in a highway segment of length $d$.
Then, for all $i$,

$$E[C] = E[C_i] = R_p r \Big( \frac{1}{d} + \frac{\lambda}{2} E\Big[ \frac{1}{|V|} \Big] \Big) \tag{19}$$

$$= R_p r \Big( \frac{1}{d} + \frac{1}{2} \frac{E[N]}{d} \Big). \tag{20}$$

Proof. As the number of intervals that partition $[a, b]$ approaches infinity, we can rewrite (12)
as

$$E[C_i] = R_p r \Big( \frac{1}{d} + \frac{\lambda}{2} \int_a^b f_V(v) \frac{1}{|v|} \, dv \Big), \tag{21}$$

which is equal to the right-hand side of (19). Let $T$ be the random variable $d/V$, which is the
duration that a car of velocity $V$ stays in this highway segment. Conditioned on the velocity,
the number of cars of velocity $V$ is Poisson distributed with mean $E[N|V] = \lambda |T|$. Hence

$$E[C_i] = R_p r \Big( \frac{1}{d} + \frac{1}{2d} \int_a^b f_V(v) \frac{\lambda d}{|v|} \, dv \Big), \tag{22}$$

which is (20). Since $E[C_i]$ is independent of the velocity of the observer, we have $E[C] = E[C_i]$. □

Note that E[N]/d is the car density on the highway. Based on this theorem, we have the 
following observations. The first one is the same as that in the case of discrete speed. The 
second one is similar but not exactly the same. The last two observations are different. 

• Incrementally Linear Scalability: The average per-node throughput increases with the
node arrival rate, $\lambda$, in an incrementally linear fashion.

• Mobility Reduces Throughput: The average per-node throughput scales as $\Theta(E[1/|V|])$.
This means that the higher the mobility, the lower the car density, and the lower the average
per-node throughput.

• Perfect Fairness: $E[C_i]$ is independent of $v_i$. This means that given the same background
traffic on the highway, the throughput of a node is independent of its own speed. In other
words, all nodes achieve the same average throughput, which is different from the case of
discrete speed.

• Equivalence of Forward and Reverse Traffic: The average throughput that a particular node
obtains from encounters with forward traffic is the same as that obtained from encounters
with reverse traffic, provided that the arrival rates and speed distributions of the two
directions are the same.
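
As a numerical illustration of Theorem 5, the following Python sketch approximates $E[1/|V|]$ by the midpoint rule and compares the result with the closed form for a uniform speed distribution. The uniform density on [20, 40], the other parameter values, and the function name are assumptions for this example; the interval must not contain zero.

```python
import math

def continuous_throughput(lam, pdf, a, b, d, R_p, r, steps=10000):
    """Theorem 5, with E[1/|V|] approximated by the midpoint rule on [a, b]."""
    dv = (b - a) / steps
    mean_inverse_speed = 0.0
    for k in range(steps):
        v = a + (k + 0.5) * dv
        mean_inverse_speed += pdf(v) / abs(v) * dv
    return R_p * r * (1 / d + lam * mean_inverse_speed / 2)

def uniform_pdf(v):
    """Assumed density: speeds uniform on [20, 40] m/s, so E[1/V] = ln(2) / 20."""
    return 1 / 20

approx = continuous_throughput(0.5, uniform_pdf, 20.0, 40.0, 10000.0, 100.0, 100.0)
exact = 100.0 * 100.0 * (1 / 10000.0 + 0.5 * (math.log(2) / 20) / 2)
```

Note that the observer's own speed does not appear anywhere in the computation, which is the fairness property stated above.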
4 Concluding Remarks



We have analyzed the effect of mobility on the performance of a VANET. Based on the Poisson 
arrival process, we derive simple formulae for throughput under both discrete and continuous 
velocity distribution. There are two major results: First, system throughput increases linearly 
with the arrival rate of vehicles. In other words, the system is linearly scalable. Second, 
system throughput decreases when all vehicles increase their speeds, implying that higher 
overall mobility is not beneficial. 

We have also investigated the throughput of individual users. For the discrete velocity case, 
the class of users having higher mobility and lower density has higher throughput. In contrast, 
for the continuous velocity case, all users have the same throughput. 

In our analysis, we assume that at most two cars meet each other at any time instant. If 
the traffic density is high, this assumption may not hold. For example, a node can overhear the 
transmission of other nodes. However, the performance of the system in such a scenario depends 
on the details of a particular transmission protocol, such as how transmission is initiated and 
how transmission conflicts are resolved. This is not within the scope of our framework and we 
leave it for future research. 



A Optimal Probability Mass Function for the Case of Discrete 
Velocity 

Given $\{v_1, v_2, \ldots, v_M\}$, we would like to know what the optimal probability mass function is.
The problem can be formally stated as follows.

$$\text{Maximize } F(\mathbf{p}) = \sum_{i < j} p_i p_j \Big( \frac{1}{|v_i|} + \frac{1}{|v_j|} \Big) \tag{23}$$

$$\text{subject to } \sum_{m=1}^{M} p_m = 1, \text{ and } p_i \geq 0 \ \ \forall i.$$


To check whether this is a convex optimization problem, we first eliminate the equality
constraint. We consider the function

$$G(p_1, \ldots, p_{M-1}) = F\Big( p_1, \ldots, p_{M-1}, 1 - \sum_{i=1}^{M-1} p_i \Big). \tag{24}$$

It can be shown that the Hessian of $G$ is given by

$$-2\, \mathrm{diag}\Big( \frac{1}{|v_1|}, \ldots, \frac{1}{|v_{M-1}|} \Big) - \frac{2}{|v_M|} J_{M-1}, \tag{25}$$

where $\mathrm{diag}(\mathbf{x})$ is the diagonal matrix with diagonal elements given by $\mathbf{x}$, and $J_n$ is the $n \times n$
all-one matrix. Since the first matrix is negative definite and the second one is negative
semidefinite, the Hessian of $G$ is negative definite. Hence, the function $G$ is strictly concave, and
there is a unique optimal point, $\mathbf{p}^*$.

Proposition 6. If $|v_1| < |v_2| < \cdots < |v_M|$, then at the optimal point $\mathbf{p}^*$, we must have

$$p_1^* \geq p_2^* \geq \cdots \geq p_M^*. \tag{26}$$

Proof. Suppose $p_i^* < p_j^*$, where $i < j$. Consider another point $\mathbf{p}'$, with all components the same
as $\mathbf{p}^*$ except that the $i$-th and the $j$-th components are swapped. By definition, it can be seen that
$F(\mathbf{p}^*) < F(\mathbf{p}')$, which contradicts the optimality of $\mathbf{p}^*$. □



Now we try to solve the optimization problem. Introduce the Lagrange multipliers $\lambda_i$,
$i = 1, 2, \ldots, M$, for the non-negativity constraints and $\nu$ for the equality constraint. The KKT
conditions after eliminating the $\lambda_i$'s become

$$p_i \Big( \nu - \sum_{k \neq i} p_k \alpha_{ik} \Big) = 0, \quad i = 1, \ldots, M, \tag{27}$$

$$\sum_{m=1}^{M} p_m = 1, \text{ and } p_i \geq 0, \quad i = 1, \ldots, M, \tag{28}$$

where $\alpha_{ik} = 1/|v_i| + 1/|v_k|$.

Let $A$ be the $M \times M$ matrix whose diagonal components are all zero and whose $(i, k)$-th component
equals $\alpha_{ik}$ for $i \neq k$. Let $A_k$ be its leading principal submatrix of order $k$, that is, its last
$M - k$ rows and $M - k$ columns are deleted. Let $\mathbf{1}_k$ be the all-one vector of length $k$, and $\mathbf{p}_k$ be the
first $k$ components of $\mathbf{p}$. Without loss of generality, we assume that $|v_1| \leq |v_2| \leq \cdots \leq |v_M|$.
The optimal solution can be found by the following algorithm.

Algorithm 1. (Initialization) Let $n := M$.

1. Compute

$$\mathbf{p}_n = \frac{A_n^{-1} \mathbf{1}_n}{\| A_n^{-1} \mathbf{1}_n \|_1},$$

where $\| \cdot \|_1$ is the $\ell_1$ norm.

2. If all components of $\mathbf{p}_n$ are non-negative, then output $\mathbf{p}_M$, that is, $\mathbf{p}_n$ followed by the
$M - n$ components already set to zero. Otherwise, let $p_n := 0$ and $n := n - 1$. Repeat step 1.

This algorithm produces the correct solution because it simply tries to solve (27), assuming
that $p_i \neq 0$ for all $i$. Note that the multiplier $\nu$ is adjusted such that the equality constraint is
satisfied. This corresponds to the normalization factor in the algorithm. If all the components
are non-negative, then the KKT conditions are satisfied. Otherwise, one of the $p_i$'s must be
zero because of (27). By Proposition 6, the last component must be zero. Hence, we reduce
the dimension of the problem by one and then repeat.

For example, we consider the situation where the nodes can be divided into five classes.
Given the velocity values, we can compute the optimal probability mass function. Three
examples are shown in Table 1. It can be seen that some of the $p_i$'s can be equal to zero.
However, this occurs only for some extreme cases. We have tested many other cases. Typically,
all $p_i$'s will be greater than zero.

Besides, it can be shown that at least two classes must have probabilities strictly greater
than zero, for otherwise no encounter in the system can occur. According to Proposition 6,
they must be $p_1$ and $p_2$. If there are only two classes or $p_i = 0$ for $i \geq 3$, then it can be easily
shown that $p_1 = p_2 = 0.5$ is optimal.
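
Algorithm 1 can be sketched in Python as follows. To avoid external dependencies and rounding effects, the linear system is solved exactly with rational arithmetic. The function names `solve` and `optimal_pmf` are hypothetical, and the input speeds are assumed to be positive numbers sorted in increasing order.

```python
from fractions import Fraction

def solve(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination (A assumed nonsingular)."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = next(i for i in range(col, n) if M[i][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [x / M[col][col] for x in M[col]]
        for i in range(n):
            if i != col and M[i][col] != 0:
                factor = M[i][col]
                M[i] = [x - factor * y for x, y in zip(M[i], M[col])]
    return [M[i][n] for i in range(n)]

def optimal_pmf(speeds):
    """Algorithm 1: drop the fastest remaining class until all components are >= 0."""
    if len(speeds) < 2:
        raise ValueError("at least two velocity classes are required")
    a = [1 / Fraction(abs(s)) for s in speeds]
    n = len(speeds)
    while True:
        # A_n has zero diagonal and alpha_ik = 1/|v_i| + 1/|v_k| off the diagonal.
        A = [[a[i] + a[k] if i != k else Fraction(0) for k in range(n)] for i in range(n)]
        x = solve(A, [Fraction(1)] * n)
        norm = sum(abs(t) for t in x)  # l1 norm
        p = [t / norm for t in x]
        if all(t >= 0 for t in p):
            return p + [Fraction(0)] * (len(speeds) - n)
        n -= 1

print([round(float(t), 4) for t in optimal_pmf([80, 90, 100, 110, 120])])
# [0.26, 0.23, 0.2, 0.17, 0.14]
```

Applied to the other two speed vectors of Table 1, the same function returns zero probability for the classes whose entries are zero in the table.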
Table 1: Optimal Probability Mass Functions for Different Velocity Values.

Speed           80       90       100      110      120
Distribution    0.26     0.23     0.2      0.17     0.14

Speed           50       60       70       80       130
Distribution    0.3077   0.2692   0.2308   0.1923   0

Speed           20       30       40       110      120
Distribution    0.3889   0.3333   0.2778   0        0